GOLD: Gaussians of Local Descriptors for image representation

Authors

  • Giuseppe Serra
  • Costantino Grana
  • Marco Manfredi
  • Rita Cucchiara
Abstract

The Bag of Words paradigm has been the baseline from which several successful image classification solutions were developed in the last decade. These solutions represent images by quantizing local descriptors and summarizing their distribution. The quantization step introduces a dependency on the dataset which, even if it significantly boosts performance in some contexts, severely limits generalization. In contrast, in this paper we propose to model the distribution of local features with a multivariate Gaussian, without any quantization. The full-rank covariance matrix, which lies on a Riemannian manifold, is projected onto the tangent Euclidean space and concatenated to the mean vector. The resulting representation, a Gaussian of local descriptors (GOLD), allows the dot product to be used to closely approximate a distance between distributions, without the need for expensive kernel computations. We describe an image by an improved spatial pyramid that avoids boundary effects through soft assignment: local descriptors contribute to neighboring Gaussians, forming a weighted spatial pyramid of GOLD descriptors. In addition, we extend the model by leveraging dataset characteristics in a mixture-of-Gaussians formulation, which further improves classification accuracy. To deal with large-scale datasets and high-dimensional feature spaces, a Stochastic Gradient Descent solver is adopted. Experimental results on several publicly available datasets show that the proposed method obtains state-of-the-art performance.

Preprint submitted to Computer Vision and Image Understanding, March 2, 2015.
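The core construction in the abstract (fit a Gaussian to an image's local descriptors, map the covariance to the tangent space, and concatenate it with the mean) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `gold_descriptor`, the regularizer `eps`, the choice of the identity matrix as the tangent point (so the projection is a plain matrix logarithm), and the sqrt(2) scaling of off-diagonal entries are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def gold_descriptor(descriptors, eps=1e-6):
    """Hypothetical sketch of a GOLD-style vector from an (n, d) array
    of local descriptors (n >= 2). Not the paper's reference code."""
    X = np.asarray(descriptors, dtype=float)
    d = X.shape[1]

    # Gaussian parameters: sample mean and full covariance.
    mean = X.mean(axis=0)
    # eps * I keeps the covariance strictly positive definite (assumption).
    cov = np.cov(X, rowvar=False) + eps * np.eye(d)

    # Matrix logarithm of a symmetric positive definite matrix via its
    # eigendecomposition: log(C) = V diag(log w) V^T.
    w, V = np.linalg.eigh(cov)
    w = np.maximum(w, eps)  # guard against round-off
    log_cov = (V * np.log(w)) @ V.T

    # Vectorize the upper triangle; off-diagonal entries are scaled by
    # sqrt(2) so the Euclidean dot product of two vectors equals the
    # Frobenius inner product of the two symmetric matrices.
    rows, cols = np.triu_indices(d)
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return np.concatenate([mean, scale * log_cov[rows, cols]])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(200, 8))   # stand-in for SIFT-like descriptors
    g = gold_descriptor(feats)
    print(g.shape)                      # d + d*(d+1)/2 entries
```

Under these assumptions, two images can be compared with an ordinary dot product between their vectors, which is what makes a linear classifier trained with stochastic gradient descent applicable. The spatial pyramid weighting and the mixture-of-Gaussians extension mentioned in the abstract are not covered by this sketch.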

Similar resources

Second-Order Statistical Texture Representation of Asphalt Pavement Distress Images Based on Local Binary Pattern in Spatial and Wavelet Domain

Assessment of pavement distresses is one of the important parts of pavement management systems to adopt the most effective road maintenance strategy. In the last decade, extensive studies have been done to develop automated systems for pavement distress processing based on machine vision techniques. One of the most important structural components of computer vision is the feature extraction met...

Selecting Local Region Descriptors with a Genetic Algorithm for Real-World Place Recognition

The basic problem for a mobile vision system is determining where it is located within the world. In this paper, a recognition system is presented that is capable of identifying known places such as rooms and corridors. The system relies on a bag of features approach using locally prominent image regions. Real-world locations are modeled using a mixture of Gaussians representation, thus allowing...

A Novel Method for Content Base Image Retrieval Using Combination of Local and Global Features

Content-based image retrieval (CBIR) has been an active research topic in the last decade. In this paper we propose an image retrieval method using global and local features. First, for local feature extraction, the SURF algorithm produces a set of interest points for each image and a 64-dimensional descriptor for each interest point, and then, to use the Bag of Visual Words model, a cluster...

Feature extraction and representation for human action recognition

Human action recognition, as one of the most important topics in computer vision, has been extensively researched during the last decades; however, it is still regarded as a challenging task especially in realistic scenarios. The difficulties mainly result from the huge intra-class variation, background clutter, occlusions, illumination changes and noise. In this thesis, we aim to enhance human...

Journal:
  • Computer Vision and Image Understanding

Volume 134

Publication date: 2015